Aggregate Values
Counting Messages
Here is an example of a query to count messages in a stream. This query may look like a similar one in SQL, except in QQL count ()
takes no arguments:
==> select count () from securities
>_,TIMESTAMP,SYMBOL,TYPE,COUNT ()
0,2011-10-26 18:47:18.381,GOOG,EQUITY,6
The timestamp
and the entity identity
that you receive belong to the last scanned message.
Add the group by entity
clause to count messages by symbol
. We will run this example on the tickquerydemo stream for illustration purposes:
==> select count () from tickquerydemo group by entity
>_,TIMESTAMP,SYMBOL,TYPE,COUNT ()
0,2011-10-17 17:21:44,GREATCO,EQUITY,3
1,2011-10-17 17:21:43,XBANK,EQUITY,4
group by
clause caused a separate state object to be created for each entity. Therefore, the counts were tracked for each entity independently. Here is the diagram to illustrate this:
Selecting Extremes
Similar to counting, the functions max ()
and min ()
are available for selecting extreme values. These functions can be applied to the majority of data types, including even BOOLEAN
. Here are two examples of using max
and min
functions in which we will look for the global range of values in a column, one range for all entities:
==> select min(price), max(price) from tickquerydemo
>_,TIMESTAMP,SYMBOL,TYPE,MIN (price),MAX (price)
0,2011-10-17 17:21:44,GREATCO,EQUITY,44.0,301.25
And here is how you find the range of values by entity:
==> select min(bidPrice), max(bidPrice) from tickquerydemo group by entity
>_,TIMESTAMP,SYMBOL,TYPE,MIN (bidPrice),MAX (bidPrice)
0,2011-10-17 17:21:44,GREATCO,EQUITY,42.5,43.0
1,2011-10-17 17:21:43,XBANK,EQUITY,295.0,301.25
Of course, you can apply aggregate functions to calculated expressions:
==> select max (offerPrice - bidPrice) as maxspread from tickquerydemo group by entity
>_,TIMESTAMP,SYMBOL,TYPE,MAXSPREAD
0,2011-10-17 17:21:44,GREATCO,EQUITY,2.0
1,2011-10-17 17:21:43,XBANK,EQUITY,4.5
Running Totals
As we have mentioned earlier, as soon as you use aggregate function, QQL switches to the aggregate query mode, in which it scans the input data aggregating the values, and, at the end, dumps the content of each aggregated Query State as result.
Add running
to the query to suppress switching to the aggregate mode and output the running aggregate value in the result.
==> select running count (), name from securities
>_,TIMESTAMP,SYMBOL,TYPE,COUNT (),name
0,2011-04-06 01:47:24.194,ESZ11,FUTURE,1,S&P 500 E-Mini Dec11
1,2011-04-06 01:47:40.790,NQZ11,FUTURE,2,Nasdaq 100 E-Mini Dec11
2,2011-04-06 01:50:55.354,ESU11,FUTURE,3,S&P 500 E-Mini Sep11
3,2011-04-06 01:51:00.442,NQU11,FUTURE,4,Nasdaq 100 E-Mini Sep11
4,2011-10-26 18:45:11.952,AAPL,EQUITY,5,Apple Inc
5,2011-10-26 18:47:18.381,GOOG,EQUITY,6,Google
Here is how you can select data while counting messages by entity:
==> select running count (), price, bidPrice, offerPrice from tickquerydemo group by entity
>_,TIMESTAMP,SYMBOL,TYPE,COUNT (),price,bidPrice,offerPrice
0,2011-10-17 17:21:40,GREATCO,EQUITY,1,_,42.5,43.5
1,2011-10-17 17:21:40,XBANK,EQUITY,1,_,301.25,301.75
2,2011-10-17 17:21:41,XBANK,EQUITY,2,301.25,301.25,301.75
3,2011-10-17 17:21:42,XBANK,EQUITY,3,301.25,298.5,301.5
4,2011-10-17 17:21:43,GREATCO,EQUITY,2,_,43.0,45.0
5,2011-10-17 17:21:43,XBANK,EQUITY,4,301.25,295.0,299.5
6,2011-10-17 17:21:44,GREATCO,EQUITY,3,44.0,43.0,45.0
You can also select running extremes, and any other aggregate functions available in QQL.
Finding First and Last Matches
Use select first (*)
in your query to find the first message matching a certain criteria, or simply the first message in a stream:
==> select first (*) from tickquerydemo
>deltix.qsrv.hf.pub.BestBidOfferMessage,TIMESTAMP,SYMBOL,TYPE,offerPrice,offerSize,bidPrice,bidSize
0,2011-10-17 17:21:40,GREATCO,EQUITY,43.5,100.0,42.5,200.0
You can group the results by entity
:
==> select first (*) from tickquerydemo group by entity
>deltix.qsrv.hf.pub.BestBidOfferMessage,TIMESTAMP,SYMBOL,TYPE,offerPrice,offerSize,bidPrice,bidSize
0,2011-10-17 17:21:40,GREATCO,EQUITY,43.5,100.0,42.5,200.0
1,2011-10-17 17:21:40,XBANK,EQUITY,301.75,40000.0,301.25,800.0
When you select first (*) ... group by
, the query will scan the entire stream, even if it is very large. If you do not group by
, the query will immediately stop after finding and returning the first matching record.
You can also add conditions to the search query:
==> select first (*) from tickquerydemo where symbol='XBANK' and offerPrice < 301.6
>deltix.qsrv.hf.pub.BestBidOfferMessage,TIMESTAMP,SYMBOL,TYPE,offerPrice,offerSize,bidPrice,bidSize
0,2011-10-17 17:21:42,XBANK,EQUITY,301.5,60000.0,298.5,800.0